Analysis of Boyer-Moore-Horspool string-matching heuristic

Authors

  • Hosam M. Mahmoud
  • Robert T. Smythe
  • Mireille Régnier
Abstract

We investigate the probabilistic behavior of a string-matching heuristic used for searching for the occurrences of a pattern in a random text. Our investigation covers the two cases when the pattern itself is fixed or random. Under suitable normalization we show that the total search time is asymptotically normally distributed in the case of a fixed pattern, whereas in the case of a random pattern the distribution of the search time becomes a mixture of degenerate distributions. An instrumental recurrence equation is obtained by shifting the pattern within the text. To handle the sum of dependent random variables appearing in the recurrence, analytic methods based on the behavior of the shift generating function near its dominant singularity in the complex plane are devised to yield moment calculation and the asymptotic distributions. Adaptation of the standard central limit theorem under mixing conditions complements our analytic toolkit.
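For readers unfamiliar with the heuristic under analysis, the following is a minimal Python sketch of the Boyer-Moore-Horspool search as it is commonly described; it is not taken from the paper. The function name `horspool_search` and the choice to report the number of character comparisons as a stand-in for "search time" are assumptions made for illustration. The shift applied after each window depends only on the text character aligned with the last pattern position, which is the pattern-shifting step the abstract refers to.

```python
def horspool_search(pattern: str, text: str):
    """Return (match_positions, comparisons) for pattern in text.

    Illustrative sketch only: 'comparisons' counts character comparisons,
    used here as an assumed proxy for the search time.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return [], 0

    # Shift table: for each character in pattern[:-1], the distance from its
    # last occurrence to the end of the pattern. Absent characters shift by m.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}

    matches = []
    comparisons = 0
    pos = 0  # index in text of the left end of the current window
    while pos <= n - m:
        # Compare the window with the pattern from right to left.
        j = m - 1
        while j >= 0:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(pos)
        # Shift according to the text character under the last pattern position.
        pos += shift.get(text[pos + m - 1], m)
    return matches, comparisons
```

Calling `horspool_search("abc", "xxabcxxabc")` scans four windows. Running the function over many randomly generated texts and recording `comparisons` is one way to observe empirically the kind of search-time distribution the abstract describes.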

Similar resources

Enhanced Pattern Matching Performance Using Improved Boyer Moore Horspool Algorithm

In computer science, the Boyer–Moore–Horspool algorithm is an algorithm for finding substrings in strings. Pattern-matching approaches can be classified as software-based or hardware-based according to how they are implemented. It is important to enhance pattern-matching performance. This paper proposes enhanced pattern-matching performance using an improved Boyer-Moore-Horspool algorithm. It combines the determinis...

On obtaining the Boyer-Moore string-matching algorithm by partial evaluation

We present the first derivation of the search phase of the Boyer-Moore string-matching algorithm by partial evaluation of an inefficient string matcher. The derivation hinges on identifying the bad-character-shift heuristic as a binding-time improvement, bounded static variation. An inefficient string matcher incorporating this binding-time improvement specializes into the search phase of the Hor...

Deriving the Boyer-Moore-Horspool algorithm

The keyword pattern matching problem has been frequently studied, and many different algorithms for solving it have been suggested. Watson and Zwaan in the early 1990s derived a set of well-known solutions from a common starting point, leading to a taxonomy of such algorithms. Their taxonomy did not include a variant of the Boyer-Moore algorithm developed by Horspool. In this paper, I present t...

Approximate Boyer-Moore String Matching

The Boyer-Moore idea applied in exact string matching is generalized to approximate string matching. Two versions of the problem are considered. The k mismatches problem is to find all approximate occurrences of a pattern string (length m) in a text string (length n) with at most k mismatches. Our generalized Boyer-Moore algorithm is shown (under a mild independence assumption) to solve the pro...

Implementation of exact-pattern matching algorithms using OpenCL and comparison with basic version

In large text-processing tasks, the exact pattern-matching problem remains time-consuming. As algorithms asymptotically faster than the existing ones cannot be developed, another approach is needed to improve efficiency. Parallel computing can significantly speed up solving the exact pattern-matching problem. That is why the current work is focused on par...

Journal:
  • Random Struct. Algorithms

Volume 10  Issue 

Pages  -

Publication date: 1997